Entry Name: Data Star Observatory-Gao-MC2

VAST Challenge 2021
Mini-Challenge 2

 

 

Team Members:

Junting Gao, Data Star Observatory, gaojunting@shuziguanxing.com PRIMARY
Siqi Shen, Fudan University, 20210980083@fudan.edu.cn
Xingui Lai, Fudan University, 20210980095@fudan.edu.cn
Qinghong Wang, Data Star Observatory, wangqinghong@shuziguanxing.com
Jiaqi Dong, Fudan University, 20210980045@fudan.edu.cn
Ziyue Lin, Fudan University, ziyuelin917@gmail.com
Lei Peng, Fudan University, 20210980132@fudan.edu.cn
Yijie Hou, Fudan University, 20210980058@fudan.edu.cn
Yuheng Zhao, Fudan University, yuhengzhao_cn@163.com
Siming Chen, Fudan University, simingchen@fudan.edu.cn


Student Team: No

 

Tools Used: D3, ThreeJS

 

Approximately how many hours were spent working on this submission in total?

About 200 hours (50 days at 4 hours/day)

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2021 is complete? Yes

 

Video

https://www.guanxingtai.net/vastchallenge2021/mc2.wmv

 

 

 

As a visual analytics expert assisting law enforcement, your mission is to identify which GASTech employees made which purchases and identify suspicious patterns of behavior. You must cope with uncertainties that result from missing, conflicting, and imperfect data to make recommendations for further investigation.

Use visual analytics to analyze the available data and develop responses to the questions below. In addition, prepare a video that shows how you used visual analytics to solve this challenge. Submission instructions are available here. Entry forms are available for download below.

Questions

1 -- Using just the credit and loyalty card data, identify the most popular locations, and when they are popular. What anomalies do you see? What corrections would you recommend to correct these anomalies? Please limit your answer to 8 images and 300 words.

 

1.1 -- The most popular locations, and when they are popular.

To answer this question, we designed a Consumption View that describes consumption behavior, shown in Fig 1.1.


Fig 1.1 Overview of Consumption View.

 

Because the locations are diverse, we divided them into six categories by the type of service they provide: FastFood, Restaurant, Industry, Living, GasStation and Entertainment. Within each category, the most popular location stands out by its number of consumers and its per capita consumption.

Taking FastFood as an example, we will describe the exploration process of finding the most popular locations in detail below.

The FastFood category contains 7 stores. Fig 1.1 gives an overview of consumption in the FastFood category during the two weeks. Fig 1.2 compares consumption at each store separately. Brew've Been Served clearly has both the most visitors and the highest amount spent. We therefore consider Brew've Been Served the most popular location in the FastFood category, with its most popular time period being 7:30-8:20 on weekdays.

Table 1.1 gives the results for all categories.
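
The ranking described above can be approximated with a short script. This is a minimal sketch, not our actual pipeline: it counts transactions only, whereas the Consumption View also considers the amount spent, and the function name, the timestamp format and the sample rows are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime

def busiest_location_and_hour(records):
    """records: list of (timestamp_string, location) pairs.

    Returns the location with the most transactions and the hour of day
    in which that location has the most transactions.
    """
    by_location = Counter(location for _, location in records)
    top_location, _ = by_location.most_common(1)[0]
    hours = Counter(
        # Assumed timestamp format; adjust to the real file layout.
        datetime.strptime(stamp, "%m/%d/%Y %H:%M").hour
        for stamp, location in records
        if location == top_location
    )
    top_hour, _ = hours.most_common(1)[0]
    return top_location, top_hour

# Made-up sample rows, for illustration only.
sample = [
    ("01/06/2014 07:35", "Brew've Been Served"),
    ("01/06/2014 07:52", "Brew've Been Served"),
    ("01/06/2014 08:10", "Brew've Been Served"),
    ("01/06/2014 12:00", "Coffee Shack"),
]
print(busiest_location_and_hour(sample))
```

Running the same function once per category would give one candidate row per category for a table such as Table 1.1.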


Fig 1.2 (a) Overview of Consumption Records in FastFood Category. (b) Consumptions of Each Store in FastFood Category.

 

Table 1.1 The Most Popular Locations and Their Most Popular Time Period


 

1.2 -- What anomalies do you see? What corrections would you recommend to correct these anomalies?

We analyze the anomalies along five dimensions: consumption time, consumption amount, consumption type, consumption mode and others. We write Lxxxx for the consumer whose credit card number ends in the four digits xxxx.

Anomalies in consumption time:

a) Several purchases occurred at Kronos Mart around 3:00 am, shown in Fig 1.3.

On January 12, L5224 made a purchase at 03:39. On January 13, L4034 made a purchase at 03:00. On January 19, L2490 made a purchase at 03:13. L5777 made a purchase at 03:45 and L2070 at 03:48.

Fig 1.3 Purchases at Kronos Mart around 3:00 am

 

b) Coffee shack, Bean There Done That, Jack's Magical Beans and Brewed Awakenings belong to the FastFood category. However, purchases at these stores all occur at 12:00, shown in Fig 1.4.


Fig 1.4 Unusual consumption times at several stores of FastFood category

 

c) The consumption records of L5924, L3295 and L9362 are suspicious, shown in Fig 1.5.

L5924 has consumption records only on January 10, and all the stores involved belong to the Industry category. Specifically, it has records at 7:59 at the airport, at 09:09 at Carlyle Chemical Inc, at 10:40 at Nationwide Refinery, and at 0:10 at Stewart and Sons Fabrication.

L3295 has consumption records only before January 10.

L9362 has consumption records only after January 12.


Fig 1.5 Suspicious consumption records of L5924, L3295 and L9362.

 

Anomalies in consumption amount:

a) At 19:20 on January 13, Frydos Autosupply n' More recorded an unusual $10,000 purchase, and the purchase appeared on only one credit card, shown in Fig 1.6.

Fig 1.6 An unusual $10,000 purchase at Frydos Autosupply n' More

 

Anomalies in consumption type:

a) L9633, L5485, L2769, L5777, L3317, L2490 and L5924 have consumption records only at stores in the Industry category. Each record involves a large amount of money and occurs during working hours, shown in Fig 1.7(a).

Anomalies in consumption mode:

Abnormal consumption mode is mainly reflected in the consumption frequency and consumption location.

a) L2769 has consumption records only at the airport. On 7 days there are two records per day, shown in Fig 1.7(b).

b) The consumption frequency of L5485 and L9633 may be abnormal. Both cards show a purchase at Nationwide Refinery around 10:00 followed almost immediately by a purchase at Stewart and Sons Fabrication around 11:00, shown in Fig 1.7(c).

Fig 1.7 (a) Abnormal Consumption Type. (b) Consumption Records of L2769. (c) Consumption Records of L5485 and L9633.

 

Others:

a) CoffeeShack had only one customer, L6417, and this customer only spent money at the store at 12:00 every weekday.


Fig 1.8 CoffeeShack's only customer

 

1.3-- What corrections would you recommend to correct these anomalies?

Matching the two cards by time, location and amount can fix some of the anomalies in consumption time and consumption amount. Because some consumption records still cannot be matched across the two cards, we divide the records into three types.

type=right: both cards were used and the consumption amounts are consistent

type=justCC: only the credit card was used

type=cash: only the loyalty card was used
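
The three types can be assigned mechanically once both record sets share a common key. The following is a minimal sketch under our own assumptions: records are reduced to hypothetical (date, location, amount) tuples, amounts must match exactly, and the function name and sample rows are illustrative only.

```python
def classify_records(credit_rows, loyalty_rows):
    """Label each record as 'right', 'justCC' or 'cash'.

    Each row is a (date, location, amount) tuple. A credit card row that has
    an identical loyalty card row is 'right'; a credit card row without one is
    'justCC'; a loyalty card row left over at the end is 'cash'.
    """
    unmatched_loyalty = list(loyalty_rows)
    labelled = []
    for row in credit_rows:
        if row in unmatched_loyalty:
            unmatched_loyalty.remove(row)  # consume the match so it is used once
            labelled.append((row, "right"))
        else:
            labelled.append((row, "justCC"))
    labelled.extend((row, "cash") for row in unmatched_loyalty)
    return labelled

# Made-up sample rows, for illustration only.
credit = [
    ("2014-01-06", "Hallowed Grounds", 12.5),
    ("2014-01-12", "Desafio Golf Course", 100.0),
]
loyalty = [
    ("2014-01-06", "Hallowed Grounds", 12.5),
    ("2014-01-07", "Guy's Gyros", 20.0),
]
for row, kind in classify_records(credit, loyalty):
    print(kind, row)
```

With exact matching, a pair whose amounts differ slightly ends up as one justCC record plus one cash record, so such pairs still need manual review.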

2 -- Add the vehicle data to your analysis of the credit and loyalty card data. How does your assessment of the anomalies in question 1 change based on this new data? What discrepancies between vehicle, credit, and loyalty card data do you find? Please limit your answer to 8 images and 500 words.

 

2.1-- How does your assessment of the anomalies in question 1 change based on this new data?

a) Abnormal consumption in Kronos Mart (anomalies in consumption time in sec 1.2)

After adding the vehicle data, we found that no car stopped at Kronos Mart from 3:00 to 4:00, while cars did stop there from 18:00 to 20:00, shown in Fig 2.1. We therefore suspect that this anomaly is caused by an error in the store's system.


Fig 2.1 Correction of Abnormal Consumption in Kronos Mart

 

b) Abnormal consumption at Coffee shack, Bean There Done That, Jack's Magical Beans and Brewed Awakenings (anomalies in consumption time in Sec 1.2)

In Fig 2.2, vehicles stop at all four stores around 7:00-8:30, so we suspect a system error in these stores.

Fig 2.2 (a) Consumption records. (b) Stop records of vehicles in the vicinity of Coffee shack, Bean There Done That, Jack's Magical Beans and Brewed Awakenings.

 

c) Suspicious consumption records of L3295 and L9362 (anomalies in consumption time in Sec 1.2)

In Sec 1.2, we found that L3295 had consumption records only before January 10 and L9362 only after January 12. The driving track of car No.29 matches the consumption tracks of both L3295 and L9362, so we infer that the owner of car No.29 owns both cards. Fig 2.3(a) indicates that this car has driving records on January 11, but there is no consumption record on that day, so we infer that the car owner lost card L3295 on January 11 and replaced it with card L9362 after January 12.

Fig 2.3 (a) Track of car No.29. (b) Consumption records of L3295 and L9362

 

d) Anomalies in consumption type in Sec 1.2

By matching vehicle data and consumption information, we found that the owners of cards with consumption records only at stores in the Industry category are truck drivers.


Fig 2.4 Track of truck drivers

 

e) Abnormal consumption mode of L2769 in Sec 1.2

The consumption records of L2769 correspond to the track of truck 104, shown in Fig 2.5, and we infer that truck 104 serves an airport-only route.

Fig 2.5 (a) Consumption records of truck 104. (b) Track of truck 104

 

2.2-- What discrepancies between vehicle, credit, and loyalty card data do you find?

a) Inconsistency of vehicle data and consumption data.

See more details in Sec 2.1 (a-b) and Fig 2.6.

b) Inconsistency of credit and loyalty card data.

The credit card and loyalty card records disagree on some consumption amounts, and in some cases a credit card or a loyalty card is used alone. For example, Fig 2.7 shows that the owner of L4164 used only the credit card at Desafio Golf Course on January 12.

In addition, Adra Nubarron made two purchases at Shoppers' Delight at around 20:00 on January 17. However, the loyalty card recorded $269.33 and the credit card recorded $289.33, shown in Fig 2.8(a).

On January 11, the loyalty card of car 20 recorded $55.36 at Guy's Gyros and the credit card recorded $55.69, shown in Fig 2.8(b).

On January 7, the loyalty card of car 7 recorded $11.59 and the credit card recorded $71.59, shown in Fig 2.8(c).

Fig 2.6 The consumption and track of car 5 on January 18.

 

Fig 2.7 Credit card and loyalty card used separately by L4164

 

Fig 2.8 (a) Mismatch between the credit card and loyalty card records of car 22 at Shoppers' Delight on January 17. (b) Mismatch between the credit card and loyalty card records of car 20 at Guy's Gyros on January 11. (c) Mismatch between the credit card and loyalty card records of car 7 at Brewed Awakenings on January 7.

 

3 -- Can you infer the owners of each credit card and loyalty card? What is your evidence? Where are there uncertainties in your method? Where are there uncertainties in the data? Please limit your answer to 8 images and 500 words.

 

3.1-- Can you infer the owners of each credit card and loyalty card? What is your evidence?

In the car-assignment file, there are 35 employees with cars and 9 truck drivers, and there are 54 loyalty cards. The relationship between credit cards and loyalty cards can be obtained by matching consumption location and amount. Then, by matching trajectories and consumption records in time, we found that

a) The 35 employees with cars can be matched to the credit and loyalty cards that belong to them. Among them, the owner of car No.5 has two loyalty cards. See Table 3.1 for details.

b) Because the data lacks information on which trucker drives which truck, we can only relate 9 credit cards to 5 trucks, not the credit cards to individual truckers. See Table 3.2 for details.

c) There are still 9 loyalty cards whose owners cannot be found, shown in Table 3.3.

Fig 3.1 shows how we found these relationships. The relationship between cars and cards was first obtained through data processing. The specific process is as follows.

(Step 1): Match the loyalty cards and credit cards based on consumption locations and consumption amounts.

(Step 2): Determine the stop points of each trajectory. Each stop point information includes coordinates, start time, end time, etc.

(Step 3): Match the loyalty cards and cars according to the stop point information.

(Step 4): According to the car-assignment file, we finally determine the correspondence between employees, loyalty cards and credit cards.
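
Steps 2 and 3 can be sketched as follows. This is a simplified illustration rather than our exact processing: it assumes the tracker reports positions only while the car is moving, so a long gap between consecutive pings marks a stop, and it matches purchases to stops by time alone. The function names, the 5-minute threshold, the card labels and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta

def find_stops(pings, min_gap=timedelta(minutes=5)):
    """pings: time-sorted list of (time, lat, lon).

    A stop starts at the last ping before a gap of at least `min_gap`
    and ends at the first ping after it.
    """
    stops = []
    for (t0, lat, lon), (t1, _, _) in zip(pings, pings[1:]):
        if t1 - t0 >= min_gap:
            stops.append({"lat": lat, "lon": lon, "start": t0, "end": t1})
    return stops

def cards_seen_during_stops(stops, purchases):
    """purchases: list of (time, card_id). Keep cards used while the car was stopped."""
    return [card for when, card in purchases
            if any(s["start"] <= when <= s["end"] for s in stops)]

# Made-up sample values, for illustration only.
day = datetime(2014, 1, 6)
pings = [
    (day.replace(hour=7, minute=30), 36.05, 24.87),
    (day.replace(hour=7, minute=31), 36.06, 24.88),
    (day.replace(hour=8, minute=5), 36.06, 24.88),
    (day.replace(hour=8, minute=6), 36.07, 24.89),
]
purchases = [
    (day.replace(hour=7, minute=45), "L1111"),
    (day.replace(hour=9), "L2222"),
]
stops = find_stops(pings)
print(cards_seen_during_stops(stops, purchases))
```

A fuller version would also compare the stop coordinates with the store position, and would count how often each card co-occurs with each car over the two weeks before assigning an owner.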

Then the Trajectory Analysis View and Consumption View were applied to verify the results.


Fig 3.1 Process of Inferring the Owners of Each Credit Card and Loyalty Card

 

Table 3.1 Exact match results (36 cards, 35 people)


 

Table 3.2 Credit Card Information for Truckers (9 Cards, 9 People)


 

Table 3.3 Loyalty Cards Not Matched to Owner Information

 


Fig 3.2 Consumption Records of Loyalty Cards Not Matched with Owner Information

 

3.2-- Where are there uncertainties in your method?

Our method assumes that each car is, in most cases, used by one person. Errors may occur if more than one person uses the same car. In addition, the track and the consumption locations may be inconsistent, for example in the case of card theft.

a) Card theft at Frydos Autosupply n' More

At 19:20 on January 13, a large purchase was recorded on the credit card of car No.1 at Frydos Autosupply n' More. However, Fig 3.3 indicates that car No.1 was at Ouzeri Elian at that time, where its owner was probably having dinner. The only car at Frydos Autosupply n' More at that time was car No.24, shown in Fig 3.4, and the owner of car No.24 used only the loyalty card for that purchase. We therefore suspect that a theft occurred.
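
The consistency check behind this finding can be expressed as a small test: was the card owner's car stopped at the store when the purchase was made? The sketch below is illustrative only. The purchase time and the place names come from the text above, while the stop intervals, the year and the function names are assumptions made for the example.

```python
from datetime import datetime

def was_at(stops, place, when):
    """stops: list of (place, start, end). True if a stop at `place` covers `when`."""
    return any(p == place and start <= when <= end for p, start, end in stops)

def is_suspect_purchase(owner_stops, place, when):
    """A purchase is suspect if the card owner's car was not at the store at that time."""
    return not was_at(owner_stops, place, when)

purchase_place = "Frydos Autosupply n' More"
purchase_time = datetime(2014, 1, 13, 19, 20)  # year assumed

# Hypothetical stop intervals; only the places are taken from the text.
car1_stops = [("Ouzeri Elian", datetime(2014, 1, 13, 19, 0), datetime(2014, 1, 13, 20, 30))]
car24_stops = [(purchase_place, datetime(2014, 1, 13, 19, 10), datetime(2014, 1, 13, 19, 40))]

print(is_suspect_purchase(car1_stops, purchase_place, purchase_time))
print(was_at(car24_stops, purchase_place, purchase_time))
```

Applying the same check to every credit card record would surface other purchases made away from the owner's car, although shared cars would produce false alarms.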


Fig 3.3 The track and consumption of car 1 on January 13th

 


Fig 3.4 The track and consumption of car 24 on January 13th

 

3.3-- Where are there uncertainties in the data?

a) Anomalies in consumption time shown in Sec 1.2 (a).

b) Missing trajectory data of car 9 and noisy trajectory data of car 28, shown in Fig 3.5 (a-b).

c) There are deviations between the building coordinates given by map.jpg and the coordinates obtained from the track data. Several deviating locations are marked in Fig 3.6.

d) Inconsistency of credit and loyalty card data shown in Sec 2.2 (b).

Fig 3.5 (a) Missing trajectory data of car 9. (b) Noisy track data of car 28.

 


Fig 3.6 Deviations between the building coordinates given by map.jpg and the coordinates obtained from the track data.

 

4 -- Given the data sources provided, identify potential informal or unofficial relationships among GASTech personnel. Provide evidence for these relationships. Please limit your response to 8 images and 500 words.

 

4.1-- Roommates

Lidelse Dedos and Birgitta Frente may be roommates. The evidence is as follows.

a) Their homes are in the same place.

b) Their purchases sometimes overlap.

c) Their car trajectories show similar stopping places.

Fig 4.1 shows that both of them had breakfast at Hallowed Grounds, except one at Bean There Done That on January 13th.

Table 4.1 Information about Lidelse Dedos and Birgitta Frente


 


Fig 4.1 Trajectories of Lidelse Dedos and Birgitta Frente during two weeks

 

4.2-- Friends

a) Hennie Osvaldo may be a friend of Lidelse Dedos and Birgitta Frente.

Hennie Osvaldo often went to the home of Lidelse Dedos and Birgitta Frente: 11 times in total, including five overnight stays, shown in Fig 4.2.

Table 4.2 Information about Lidelse Dedos, Birgitta Frente and Hennie Osvaldo


 


Fig 4.2 Trajectories of Hennie Osvaldo during the two weeks

 

4.3-- Members of a certain secret organization

Inga Ferro, Loreto Bodrogi, Isia Vann and Hennie Osvaldo may be members of a secret organization. The evidence is as follows.

a) They are all Security staff, and their homes are close to each other.

b) Cars 15, 16 and 21 were involved in abnormal surveillance activities.

c) Cars 13, 15 and 21 went to unknown6, unknown7, unknown8, unknown9 and unknown10 between 11:00 and 12:00.

Table 4.3 Employee Information of possible members of a secret organization


 

Fig 4.3 Trajectories (a) and consumption records (b) of cars 13, 15, 16 and 21

 

4.4-- Lovers

4.4.1-- Elsa Orilla & Brand Tempestad (Lovers)

a) They went to Chostus Hotel after 11:00 on January 8th, 10th, 14th and 17th, and stayed until 14:00.

b) On January 6th, they had lunch together at Ouzeri Elian, and only Elsa Orilla's card had a corresponding consumption record.

c) At 19:00 on January 15th, they arrived at Frydos Autosupply n' More at the same time and left at 20:00.

Table 4.4 Information about Elsa Orilla and Brand Tempestad


 

Fig 4.4 Similar consumption records (a) and driving trajectories (b) of Elsa Orilla and Brand Tempestad.

 

4.4.2-- Adra Nubarron & Isande Borrasca (Lovers)

a) Shared cars many times

Isande Borrasca took Adra Nubarron's car to Abila Zacharo on January 6th, Kalami Kafenion on January 7th, Gelatogalore on January 8th, Kalami Kafenion on January 10th, Abila Zacharo on January 13th, Katerina's Café on January 15th, Abila Zacharo on January 16th, and Guy's Gyros on January 19th.

Adra Nubarron took Isande Borrasca's car to Abila Zacharo on January 9th and to Hippokampos on January 17th.

b) Shared cards for purchases

Almost all of the bills were paid with Adra Nubarron's credit card, while Isande Borrasca presented the loyalty card.

Table 4.5 Information about Adra Nubarron and Isande Borrasca


 

Fig 4.5 Similar trajectories of Isande Borrasca (car 28) and Adra Nubarron (car 22) over many days.

 

4.5-- Friend's Party

Location: Lars Azada's home

Time: At 12:20 on January 9th

Table 4.6 Party Attendee Information


 


Fig 4.6 Stop point records for party attendees

 

5 -- Do you see evidence of suspicious activity? Identify 1-10 locations where you believe the suspicious activity is occurring, and why. Please limit your response to 10 images and 500 words.

 

5.1-- Abnormal surveillance activities (4 times in total)

 

a) Surveillance location A: Ada Campo Corrente's home

Evidence: At 23:00 on January 6th, Isia Vann (car 16) drove to the vicinity of Ada Campo Corrente's home (home_10), shown in Fig 5.1(a). At 3:35 on January 7th, Loreto Bodrogi (car 15) also arrived there. They left one after another around 7:30 on January 7th, shown in Fig 5.1(b). The identities of the three are sensitive and the timing of the activity is unusual.

Table 5.1 Participant information of surveillance location A


 


Fig 5.1 Trajectories of Isia Vann (a) and Loreto Bodrogi (b) on January 6-7th.

 

b) Surveillance location B: Orhan Strum's home

Evidence: At 23:06 on January 8th, car 24 drove to the vicinity of car 32's home and did not leave until 3:30 on January 9th, shown in Fig 5.2(a). Two minutes later, car 15 arrived near car 32's home and did not leave until 7:23 on January 9th, shown in Fig 5.2(b). We suspect that car 15 took over the shift from car 24.

Table 5.2 Participant information of surveillance location B


 

Fig 5.2 Trajectories of Minke Mies (a) and Loreto Bodrogi (b) on January 8-9th

 

c) Surveillance location C: Willem Vasco-Pais's home

Evidence: At 23:07 on January 10th, car 16 drove to the vicinity of car 35's home and did not leave until 3:23 on January 11th, shown in Fig 5.3(a). Eight minutes later, car 21 arrived near car 35's home and did not leave until 11:02 on January 11th, shown in Fig 5.3(b). We suspect that car 21 took over the shift from car 16.

Table 5.3 Participant information of surveillance location C


 


Fig 5.3 Trajectories of Isia Vann (a) and Hennie Osvaldo (b) on January 10-11th

 

d) Surveillance location D: Ingrid Barranco's home

Evidence: At 23:08 on January 13th, car 21 drove to the vicinity of car 4's home and did not leave until 3:30 on January 14th, shown in Fig 5.4(a). One minute later, car 24 arrived near car 4's home and did not leave until 7:47 on January 14th, shown in Fig 5.4(b). We suspect that car 24 took over the shift from car 21.

Table 5.4 Participant information of surveillance location D


 

Fig 5.4 Trajectories of Hennie Osvaldo (a) and Minke Mies (b) on January 13-14th
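
The shift-change pattern in the four cases above can be searched for automatically: one car leaves a place and a different car arrives at the same place within a few minutes. The sketch below is a hedged illustration. The times for location D come from the evidence above, while the year, the place label, the 15-minute window and the function name are our assumptions.

```python
from datetime import datetime, timedelta

def find_handovers(stays, max_gap=timedelta(minutes=15)):
    """stays: list of (car_id, place, arrive, leave).

    Returns (leaving_car, arriving_car, place) for every pair where a second
    car arrives at the same place no more than `max_gap` after the first leaves.
    """
    handovers = []
    for car_a, place_a, _, leave_a in stays:
        for car_b, place_b, arrive_b, _ in stays:
            gap = arrive_b - leave_a
            if car_a != car_b and place_a == place_b and timedelta(0) <= gap <= max_gap:
                handovers.append((car_a, car_b, place_a))
    return handovers

# Surveillance location D; the year 2014 and the place label are assumptions.
place_d = "near home of car 4"
stays = [
    (21, place_d, datetime(2014, 1, 13, 23, 8), datetime(2014, 1, 14, 3, 30)),
    (24, place_d, datetime(2014, 1, 14, 3, 31), datetime(2014, 1, 14, 7, 47)),
]
print(find_handovers(stays))
```

Restricting the search to stays that begin late at night and to places that are another employee's home would keep ordinary daytime coincidences, such as shared lunch spots, out of the result.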

 

5.2-- Gathering activities

a) Location: Desafio Golf Course

At 13:00 on January 12th and 13:00 on January 19th, several executives went to Desafio Golf Course.

Table 5.5 Information on the executives involved in the activities

 

Fig 5.5 Gathering activities of several executives on January 12th (a) and January 19th (b)

 

5.3-- Suspicious locations involving truckers

The car trajectories of all employees in the Security department show that the locations unknown_6, unknown_7, unknown_8, unknown_9 and unknown_10 are suspicious. The participants involved are listed in Table 5.6 (members of a secret organization in Sec 4.3). We list these as suspicious locations for the following reasons.

a) The participants appeared regularly at these locations, always around 11:00-12:00.

b) The participants are the four people in Table 5.6, all of whom are employees in the Security department.

c) These locations are in remote areas.

Table 5.6 Information about people in suspicious locations


 

Fig 5.6 (a) Cars related to the suspicious locations. (b) Positions of the suspicious locations.

 

5.4-- Visits to the company at midnight

Nils Calixto went to GASTech four times at midnight, shown in Table 5.7 and Fig 5.7.

Table 5.7 Information on Nils Calixto, who went to GASTech at midnight

 

Fig 5.7 Nils Calixto's four visits to GASTech at midnight

 

5.5-- Abnormal activities of truckers

Table 5.8 details several abnormal activities involving truckers.

Table 5.8 Details about abnormal activities of truckers


 


Fig 5.8 Trajectories of truck 104, 105 and 107

 

5.6-- Card Theft

a) Location: Frydos Autosupply n' More

At 19:20 on January 13, the credit card of car 1 recorded a $10,000 purchase at Frydos Autosupply n' More. However, Fig 5.9 (a) indicates that car 1 was at Ouzeri Elian at that time, where its owner was probably having dinner. The only car at Frydos Autosupply n' More at that time was car 24, shown in Fig 5.9 (b), and the owner of car 24 used only the loyalty card for the purchase. We therefore suspect that a theft occurred.

Table 5.9 Information on the persons involved in the card theft


 


Fig 5.9 (a) Trajectory and consumption information of Nils Calixto at 19:20 on January 13. (b) Trajectory and consumption information of Minke Mies at 19:20 on January 13.

 

5.7-- Location involving Abnormal Consumption

Location: Frydos Autosupply n' More

Loyalty cards involved: L2247, L9406, L9018, L2343, L8328, L4034, L6110

Evidence: The company did not allocate cars to the owners of these loyalty cards, yet their loyalty cards have consumption records at Frydos Autosupply n' More (garage). Some cards have multiple records.


Fig 5.10 Consumption records in garage of employees without cars.

 

6 -- If you solved this mini-challenge in 2014, how did you approach it differently this year?

We did not participate in this mini-challenge in 2014.